Data for Waitlist Mortality

Pediatric HR Candidates

Author

R. Jerome Dixon, Michael D. Porter

Published

July 5, 2024

This notebook shows how the data for modeling waitlist mortality is generated.

TODO: 1. Combine candidate data with waitlist ranges to see if wide/narrow offer consideration impacts survival (e.g., only consider donors < 200mi)

Changes: 1. Added Body Surface Area (BSA) 2. Updated Center Stats Calculations

Get Population

Thoracic Data

The thoracic data has information collected on transplant candidates and recipients. The data was retrieved from UNOS on 2022-04-01.

Code
thoracic = read_rds(file.path(dir_data, "thoracic.rds"))
Code
censoring_date = as.Date("2022-04-01")

Outcomes

The REM_CD is the coded reason for removal from the waitlist. The details are in the STAR documentation.

  • The created outcome variable corresponds to an adverse outcome defined as death (REM_CD = 8) or Candidate too sick to transplant (REM_CD = 13, “Cand. cond. deteriorated,too sick to tx”).
    • outcome = 1 is adverse.
    • outcome = 0 is everything else (i.e., candidate was removed from the waitlist for anything besides death or deteriorization).
Code
REM_CD_code = readxl::read_excel(file.path(dir_data, "STAR File Documentation.xls"),
                                sheet = "THORACIC_FORMATS_FLATFILE",
                                skip=1) %>%
  filter(`SAS ANALYSIS FORMAT` == "REMCD") %>%
  select(
    REM_CD = `Data Field Value`,
    REMOVAL_REASON = `Data Field Formatted Value`
  ) %>% 
  mutate(
    outcome = 1L*(REM_CD %in% c("8", "13")),
    outcome_full = case_match(REM_CD,
                         #: adverse outcomes
                         "8" ~ "Death",
                         "13" ~ "Too sick", # Cand. cond. deteriorated,too sick to tx
                         #: Still alive or Tx'ed
                         c("2", "3", "4", "14", "15", "21", "23") ~ "Tx",
                         "6" ~ "Alive",  # Refused transplant
                         "7" ~ "Alive",  # Transferred to another center 
                         "12" ~ "Alive", # Cand. condition improved, tx not needed
                         "16" ~ "Alive", # Candidate Removed in Error
                         #: other
                         "10" ~ "Listed in Error",
                         "24" ~ "Lost contact",
                         .default = "Other"
                         )
  )

Population

1. Pediatric Candidate (Age < 18) on waitlist for heart

Code
thor1 = thoracic %>% 
  filter(
    WL_ORG == "HR",    
    INIT_AGE < 18, 
  )

2. Added to waitlist between 2010 - 2020

Code
date_rng = c("2010-01-01", "2020-12-31") %>% as.Date()
thor2 = thor1 %>% 
  filter(between(as.Date(INIT_DATE), date_rng[1], date_rng[2]))

There are 6244 pediatric waitlists between 2010-01-01 and 2020-12-31. There were 5972 unique candidates (since some were on multiple waitlists during this period).

3. No prior heart transplants

We can discuss allowing these candidates. My initial reason for exclusion is keep things simple. Otherwise, we’d need to keep track of the number of past transplants and the time from the last transplant (some are only a few days).

Code
thor3 = thor2 %>% 
  filter(
    NUM_PREV_TX == 0,    
    !(PREV_TX %in% "Y"),
    !(THORACIC_DGN %in% c(1700, 1100:1199)) # codes related to RE-TX/GF
    )  

This removed 399 and leaves 5845.

4. Patient’s first heart waitlist

Keep only the data from the first time a patient was listed. This will exclude the waitlists corresponding to patients i) who had previous heart transplants, ii) are listed multiple times, and iii) were transferred to another center.

We could keep some of these, but handling the multiple waitlists and transfers becomes a little tricky.

Code
first_wl = thoracic %>% 
  group_by(PT_CODE) %>% 
    slice_min(INIT_DATE, with_ties = FALSE) %>% 
  ungroup()
# For more precise identification of first waitling list, use the 
#   waitlist data: WL_PT %>% filter(seq_wl == 1)

thor4 = thor3 %>% semi_join(first_wl, by = "WL_ID_CODE")

This removed 176 and leaves 5669.

5. Remove waitlists with mistakes

The removal code corresponding to listing mistakes (REM_CD = 10 “Candidate listed in error”) are removed from the data.

Code
thor5 = thor4 %>% filter( !(REM_CD %in% c("10")) )

This removed 0 and leaves 5669.

6. Remove waitlists with unknown outcomes

The removal codes corresponding to candidates lost (REM_CD = 24 “Unable to contact candidate”) and candidates still alive at data collection/censored (REM_CD = NA) are removed from the data.

Code
thor6 = thor5 %>% filter( REM_CD != 24, !is.na(REM_CD))

This removed 146 and leaves 5523.

Summary of removal reasons for our population

This gives our final population

Code
population = thor6 %>% select(WL_ID_CODE, REM_CD) %>%   
  mutate(REM_CD = as.character(REM_CD)) %>% 
  left_join(REM_CD_code, by = "REM_CD")

write_csv(population, file.path(dir_save, "population.csv"))
Code
population %>% 
  count(REM_CD, sort=TRUE) %>% 
  mutate(p = n/sum(n)) %>% 
  left_join(REM_CD_code, by = "REM_CD") %>% 
  print_table()
REM_CD n p REMOVAL_REASON outcome outcome_full
4 4495 0.814 Deceased Donor tx, removed by tx center 0 Tx
8 338 0.061 Died 1 Death
12 317 0.057 Cand. condition improved, tx not needed 0 Alive
13 202 0.037 Cand. cond. deteriorated,too sick to tx 1 Too sick
9 75 0.014 Other 0 Other
7 71 0.013 Transferred to another center 0 Alive
6 11 0.002 Refused transplant 0 Alive
14 8 0.001 Tx at another center (multiple-listing) 0 Tx
16 3 0.001 Candidate Removed in Error 0 Alive
21 3 0.001 Patient died during TX procedure 0 Tx

And overall waitlist survival for our population (\(0\) is survive, \(1\) is death or deterioration):

Code
population %>%   
  count(outcome) %>% 
  mutate(p = n/sum(n)) %>% 
  print_table()
outcome n p
0 4983 0.902
1 540 0.098

Make Data for Modeling

Functions for variable recoding:

Code
## Functions for recoding variables.

#: Recoding Race/Ethnicity
recode_race <- function(ETHCAT){
  recode(ETHCAT, 
          `1` = "White", 
          `2` = "Black", 
          `4` = "Hispanic", 
          `5` = "Asian", 
          # `6` = "Amer Ind/Alaska Native", 
          # `7` = "Native Hawaiian/other Pacific Islander",
          # `9` = "Multiracial",
          .default = "Other", .missing = "Unknown"
          ) %>% factor()  
}

#: Recoding Blood Type
recode_ABO <- function(ABO){
  recode(ABO, A1 = "A", A2 = "A", A1B = "AB", A2B = "AB", 
         .missing = "Unknown") %>% factor()  
}

#: Recode Y = Yes, N = No, U = Unknown
recode_YNU <- function(YNU){
  recode(YNU, N = "No", Y = "Yes", U = "Unknown", .missing = "Unknown") %>% factor()
}

#: Recode P = Yes, N = No, anything else = Unknown
recode_PNO <- function(PNO){
  recode(PNO, N = "No", P = "Yes", .default = "Unknown", .missing = "Unknown") %>% factor()
}

recode_STATUS <- function(STATUS){
  case_match(STATUS, 
             2010 ~ "Status 1A", 
             2020 ~ "Status 1B",
             2030 ~ "Status 2", 
             2999 ~ "Inactive",
             .default = "Other"
             ) %>% 
    factor(levels = c("Status 1A", "Status 1B", "Status 2", "Inactive"), 
           ordered = TRUE)
}

Waitlist Data

  • TODO: consider calculating how “wide” the acceptance values are. E.g., will center consider donor 2x larger?

Load waitlist data

Code
WL = read_rds(file.path(dir_data, "cand_wl_hist.rds"))

Get waitlist predictor variables

Code
DATA_WL = WL %>% 
  filter(CHG_TY == "A") %>% # collect data when added to waitlist
  semi_join(population, by = "WL_ID_CODE") %>% 
  select(
    WL_ID_CODE, 
    # ORG, 
    # STATUS, 
    WL_DT = CHG_DT,   # Date-time candidate first added to waitlist
    PRELIM_XMATCH_REQ, 
    # INACT_REASON_CD,      # not many cases
    starts_with("DONCRIT"), 
    -DONCRIT_ACPT_HIST_CIG, # all missing
    -DONCRIT_GENDER_REQ,    # not many gender specified
  ) %>% 
  mutate(
    PRELIM_XMATCH_REQ = recode_YNU(PRELIM_XMATCH_REQ), 
    across(starts_with("DONCRIT_ACPT"), recode_YNU)
  ) 

EDA: Waitlist Data

Code
model_data = DATA_WL %>% full_join(population, by = "WL_ID_CODE")
vars = colnames(model_data) %>% setdiff(colnames(population))
walk(vars, ~plot_data(model_data, var = .x) %>% print)

Candidate Demographic Attributes

Get candidate demographic predictor variables

Code
candidate_demographic_data = thoracic %>% 
  semi_join(population, by="WL_ID_CODE") %>% 
  transmute(
    WL_ID_CODE, 
    #: Age at time of registration or listing
    AGE = pmax(INIT_AGE, 0),  #set -1 to age 0 
    #: Gender {M,F}
    GENDER = factor(GENDER),
    #: Race
    RACE = recode_race(ETHCAT),    
    #: Weight in kg
    WEIGHT_KG = coalesce(INIT_WGT_KG_CALC, WGT_KG_TCR), 
    #: Height in cm
    HEIGHT_CM = coalesce(INIT_HGT_CM_CALC, HGT_CM_TCR),
    #: BMI
    BMI = coalesce(INIT_BMI_CALC, BMI_TCR),
    #: Body Surface Area (BSA) using Mosteller's formula
    BSA = sqrt(HEIGHT_CM * WEIGHT_KG / 3600), 
    #: Blood Type
    ABO = recode_ABO(ABO),
    #: Citizenship
    CITIZENSHIP = case_match(CITIZENSHIP, 
                      c(1,2) ~ "US", # 2 ~ "Resident Alien",
                      #3 ~ "Non-resident Alien",
                      #c(4, 5) ~ "Non US",
                      6 ~ "Travel for Tx", 
                      .default = "Other"
                    )
  )

EDA: Candidate Demographics

Code
model_data = candidate_demographic_data %>% 
  full_join(population, by = "WL_ID_CODE")
vars = colnames(model_data) %>% setdiff(colnames(population))
walk(vars, ~plot_data(model_data, var = .x) %>% print)

Candidate Risk Level Attributes

Code to convert functional status code into description

Code
# Functional Status
library(readxl)
FUNC_STAT_dict = read_excel(
  file.path(dir_data, "STAR File Documentation.xls"), 
  sheet = "THORACIC_FORMATS_FLATFILE",
  skip = 1
) %>% 
  filter(`SAS ANALYSIS FORMAT` == "FUNCSTAT") %>% 
  transmute(
    code = `Data Field Value`,
    descr = `Data Field Formatted Value`
  ) %>% 
  # convert to integer code; drop non-numeric values
  filter(!is.na(code), code != "**OTHER**", code != "Null or Missing") %>% 
  mutate(code = as.integer(code)) 


recode_FUNC_STAT <- function(FUNC_STAT){
  levs = FUNC_STAT_dict %>% 
    filter(code %in% c(996, 998, 4000:4900)) %>% 
    pull(descr)
  ind = match(FUNC_STAT, FUNC_STAT_dict$code)
  factor(FUNC_STAT_dict$descr[ind], levels = levs)
}

Get candidate risk predictor variables

Code
candidate_risk_data = thoracic %>% 
  semi_join(population, by="WL_ID_CODE") %>% 
  transmute(
    WL_ID_CODE,
    STATUS = recode_STATUS(INIT_STAT),
    #: Life Support in general 
    LIFE_SUPPORT_CAND_REG = recode_YNU(LIFE_SUP_TCR),
    LIFE_SUPPORT_OTHER = OTH_LIFE_SUP_TCR,  # Binary
    PGE_TCR, # LIFE_SUPPORT_PGE = PGE_TCR             # Binary 
    #: ECMO (Life Support)
    ECMO_CAND_REG = ECMO_TCR,  # ifelse(ECMO_TCR == 1, "Yes", "No"),
    #: VAD 
    VAD_CAND_REG = case_when(
      VAD_DEVICE_TY_TCR == 1 ~ "No", 
      VAD_DEVICE_TY_TCR  > 1 ~ "Yes", 
      .default = "No"  # not many missing
    ),
    VAD_DEVICE_TY_TCR = case_match(VAD_DEVICE_TY_TCR,
                                   1~"NONE",
                                   2~"LVAD",
                                   3~"RVAD",
                                   4~"TAH",
                                   5~"LVAD+RVAD",
                                   6~"LVAD/RVAD/TAH Unspecified"
                                   ) %>% factor(),
    #: PATIENT ON LIFE SUPPORT - VENTILATOR
    VENTILATOR_CAND_REG = VENTILATOR_TCR, #ifelse(VENTILATOR_TCR == 1, "Yes", "No"),
    #: Life Support (Other)
    # IABP_TCR
    # IABP_TRR
    # PGE_TCR
    # PGE_TRR
    
    #: Functional Status 
    FUNC_STAT_CAND_REG = recode_FUNC_STAT(FUNC_STAT_TCR),

    # on other waitlists?
    WL_OTHER_ORG = case_when( 
      MULTIORG == "Y" ~ "Y",
      WLHL == "Y" ~ "Y",
      WLIN == "Y" ~ "Y",
      WLKI == "Y" ~ "Y",
      WLKP == "Y" ~ "Y",
      WLLI == "Y" ~ "Y",
      WLLU == "Y" ~ "Y",
      .default = "N"
    ) %>% recode_YNU(),
    
    #: transplant history
    # NUM_PREV_TX,         # number of prev HR TX (self-reported?)
    # PREV_TX_HR = PREV_TX,# {Y,N} previous HR TX (from data); NA if no Tx (data leakage warning: this is missing if not transplanted)
    # PREV_TX_ANY_ORGAN = PREV_TX_ANY,
    # DAYS_SINCE_PREV_TX_HR = PRVTXDIF, this is at time of TX
    
    #: Other
    CEREB_VASC = recode_YNU(CEREB_VASC), 
    DIAB = case_match(DIAB, 1~"None", 2~"Type I", 3~"Type II", c(4,5) ~ "Type Unknown", .default = "Unknown") %>% factor(),
    DIALYSIS_CAND = case_match(DIAL_TY_TCR, 1~"No", c(2,3)~"Yes", .default="Unknown") %>% factor(),
    HEMODYNAMICS_CO = HEMO_CO_TCR, 
    IMPL_DEFIBRIL = recode_YNU(IMPL_DEFIBRIL),
    # INHALED_NO,
    INOTROP_VASO_CO_REG = recode_YNU(INOTROP_VASO_CO_TCR), 
    INOTROPES_TCR,
    MOST_RCNT_CREAT, # CREATININE_REG = 
    eGFR = 0.412 * coalesce(INIT_HGT_CM_CALC, HGT_CM_TCR) / pmax(MOST_RCNT_CREAT, .001),
    TOT_SERUM_ALBUM,
  )

EDA: Candidate Risk

Code
model_data = candidate_risk_data %>% full_join(population, by = "WL_ID_CODE")
vars = colnames(model_data) %>% setdiff(colnames(population))
walk(vars, ~plot_data(model_data, var = .x) %>% print)

Candidate Diagnosis

  • The candidate diagnosis is found in the thoracic data.
    • THORACIC_DGN: Waitlist Candidate Diagnosis
    • TCR_DGN: Candidate Diagnosis at Listing
    • DIAG: Diagnosis from at Transplant and if missing, from TCR
    • TCR_DGN_OSTXT and DIAG_OSTXT give text diagnosis
  • Dr. Haregu developed the following coding for CAND_DIAG using THORACIC_DGN variable:
    • Codes: 1004,1006 ~ “Myocarditis”,
    • Codes: 1205, 1206 ~ “Congenital Heart Disease Without Surgery”,
    • Code: 1207 ~ “Congenital Heart Disease With Surgery”,
    • Codes: 1000, 1001, 1002, 1003, 1005, 1007, 1049, 1209 ~ “Dilated Cardiomyopathy”,
    • Codes: 1050, 1052, 1054, 1099 ~ “Restrictive Cardiomyopathy”,
    • Code: 1201 ~ “Hypertrophic Cardiomyopathy”,
    • Code: 1202 ~ “Valvular Heart Disease”,
    • Code: 999 and DIAG_OSTXT in Other_Cardiomyopathy ~ “Dilated Cardiomyopathy”
    • For all other codes: ~ “Other”

Load diagnosis codes

Code
## Read dictionary
library(readxl)
UNOS_dict = read_excel(
  file.path(dir_data, "STAR File Documentation.xls"), 
  sheet = "THORACIC_FORMATS_FLATFILE",
  skip = 1
) %>% 
  filter(`SAS ANALYSIS FORMAT` == "TH_DGN")%>% 
  transmute(
    code = as.numeric(`Data Field Value`),
    descr = `Data Field Formatted Value`
  ) %>% 
  filter(!is.na(code)) 

#: from SRTR: https://www.srtr.org/tools/posttransplant-outcomes/ "Additional Info" tab
SRTR_dict = read_csv(file.path(dir_data,"SRTR-candidate-diagnosis.csv")) %>% 
  transmute(
    CAND_DIAG_CODE = `OPTN Diagnosis Code`,
    SRTR_DIAG = str_remove(`SRTR Diagnosis Group`, "Heart: ")
  ) 

#: From McCulloch. If code == 999 (Other), then check free text
Other_Cardiomyopathy = 
c("ARRHYTHMOGENIC RIGHT VENTRICULAR CARDIOMYOPATHY", "ARRHYTHMOGENIC RV DYSPLASIA", 
  "ARRYTHMOGENIC BIVENTRICULAR DYSFUNCTION", "BARTH SYNDROME", 
  "BARTH SYNDROME, AND DCM", "BIVENTRICULAR NON-COMPACTION CARDIOMYOPATHY", 
  "CARDIOMYOPATHY", "CONGENITAL ARRHYTHMIA", "CONGENITAL COMPLETE HEART BLOCK", 
  "CONGENITAL HEART BLOCK", "CONGENITAL LONG QT SYNDROME", "HYPERTROPHIC", 
  "HYPERTROPHIC/DILATED CARDIOMYOPATHY", "KAWASAKI'S SYNDROME, CORONARY ARTERY ANEURYSMS", 
  "LEFT VENTRICULAR NON-COMPACTION", "LONG Q-T SYNDROME TYPE 3", 
  "LONG QT SYNDROME", "LV NON-COMPACTION", "LVNC/MIXED PHENOTYPE CARDIOMYOPATHY", 
  "MALIGANT VENTRICULAR TACHYCARDIA", "MALIGNANT ARRHYTHMIA", "MALIGNANT LONG QT SYNDROME, TYPE 3", 
  "MATERNAL SJOGRENS", "MIXED HYPERTROPHIC AND DILATED CARDIOMYOPATHY", 
  "MYOCARDITIS", "NON COMPACTION CARDIOMYOPATHY", "NON-COMPACTION", 
  "NON-COMPACTION CARDIOMYOPATHY", "NON-COMPACTION WITH RESTRICTIVE PHYSIOLOGY", 
  "POLYMORPHIC VENTRICULAR TACHYCARDIA", "PROLONGED QT SYNDROME", 
  "REFRACTORY VENTRICULAR ARRHYTHMIA", "SUPRAVENTRICULAR TACHYCARDIA, WOLFF-PARKINSON WHIT", 
  "VENTRICULAR ARRYTHMIA", "VTACH/SVT")

Other_CHD = 
c("CHD, TRICUSPID ATRESIA", "CONGENITAL HEART DISEASE, PULMONARY ATRESIA", 
  "CONGESTIVE HEART FAILURE", "EBSTEINS ANOMALY", "HYPOPLASTIC AORTIC ARCH", 
  "TETRALOGY OF FALLOT", "TETRALOGY OF FALLOT - CONGENITAL", "TETRALOGY OF FALLOT WITH VSD", 
  "TRICUSPID ATRESIA", "UNBALANCED AV CANAL", "UNBALANCED AV CANAL WITH HYPOPLASTIC AORTIC ARCH"
)

recode_DIAG <- function(CODE, TXT){
  case_when(
      CODE %in% c(1004,1006) ~ "Myocarditis",
      CODE %in% c(1205, 1206) ~ "Congenital Heart Disease Without Surgery",
      CODE == 1207 ~ "Congenital Heart Disease With Surgery",
      CODE %in% c(1000, 1001, 1002, 1003, 1005, 1007, 1049, 1209) ~        "Dilated Cardiomyopathy",
      CODE %in% c(1050, 1052, 1054, 1099) ~ "Restrictive Cardiomyopathy",
      CODE == 1201 ~ "Hypertrophic Cardiomyopathy",
      CODE == 1202 ~ "Valvular Heart Disease",
      CODE == 999 & (TXT %in% Other_Cardiomyopathy) ~ "Dilated Cardiomyopathy",
      #CAND_DIAG_CODE == 999 & CAND_DIAG_TXT %in% Other_CHD ~ "Congenital Heart Disease",
      .default =  "Other" ) %>% 
    factor()  
}

Get candidate diagnosis predictors

Code
candidate_diagnosis = thoracic %>% 
  semi_join(population, by="WL_ID_CODE") %>% 
  transmute(
    WL_ID_CODE,
    CAND_DIAG = recode_DIAG(THORACIC_DGN, TCR_DGN_OSTXT),
    CAND_DIAG_LISTING = recode_DIAG(TCR_DGN, TCR_DGN_OSTXT),
    CAND_DIAG_CODE = str_c(THORACIC_DGN,": ", CAND_DIAG) %>% factor()
    ) 

EDA: Candidate Diagnosis

Code
model_data = candidate_diagnosis %>% full_join(population, by = "WL_ID_CODE")
vars = colnames(model_data) %>% setdiff(colnames(population))
walk(vars, ~plot_data(model_data, var = .x) %>% print)

Outcomes using the individual diagnosis codes:

Code
tmp = candidate_diagnosis %>%
  left_join(thoracic %>% select(WL_ID_CODE, THORACIC_DGN, TCR_DGN)) %>% 
  left_join(UNOS_dict, by = c("THORACIC_DGN" = "code")) %>% # descr.x
  left_join(UNOS_dict, by = c("TCR_DGN" = "code")) %>%      # descr.y
  transmute(
    WL_ID_CODE,
    CAND_DIAG, 
    CAND_DIAG_LISTING,
    CAND_DIAG_CODE_WL = str_c(THORACIC_DGN,": ", descr.x) %>% factor(),
    CAND_DIAG_CODE_REG = str_c(TCR_DGN,": ", descr.y) %>% factor()
  ) %>% 
  left_join(population, by = "WL_ID_CODE")
Code
tmp %>% 
  count(CAND_DIAG_CODE_WL, CAND_DIAG, outcome) %>% 
  spread(outcome, n, fill=0) %>% 
  mutate(p = (`0` + .9*5) / (`0` + `1` + 5)) %>% 
  arrange(p) %>% 
  print_table()
CAND_DIAG_CODE_WL CAND_DIAG 0 1 p
1202: VALVULAR HEART DISEASE Valvular Heart Disease 9 4 0.750
1054: RESTRICTIVE MYOPATHY: SEC TO RADIAT/CHEM Restrictive Cardiomyopathy 7 2 0.821
1206: CONGENITAL HEART DEFECT - WITHOUT SURGERY Congenital Heart Disease Without Surgery 280 55 0.837
1205: CONGENITAL HEART DEFECT - HYPOPLASTIC LEFT HEART SYNDROME - UNOPERATED Congenital Heart Disease Without Surgery 35 6 0.859
1207: CONGENITAL HEART DEFECT - WITH SURGERY Congenital Heart Disease With Surgery 2124 326 0.867
999: OTHER - SPECIFY Other 57 8 0.879
1203: CONGENITAL HEART DEFECT - PRIOR SURGERY UNKNOWN Other 7 1 0.885
1002: DILATED MYOPATHY: POST PARTUM Dilated Cardiomyopathy 8 1 0.893
1004: DILATED MYOPATHY: MYOCARDITIS Myocarditis 180 19 0.904
999: OTHER - SPECIFY Dilated Cardiomyopathy 32 3 0.912
1005: DILATED MYOPATHY: ALCOHOLIC Dilated Cardiomyopathy 1 0 0.917
1204: CANCER Other 1 0 0.917
1099: RESTRICTIVE MYOPATHY: OTHER SPECIFY Restrictive Cardiomyopathy 48 4 0.921
1052: RESTRICTIVE MYOPATHY: ENDOCARDIAL FIBROS Restrictive Cardiomyopathy 2 0 0.929
1001: DILATED MYOPATHY: ADRIAMYCIN Dilated Cardiomyopathy 42 3 0.930
1201: HYPERTROPHIC CARDIOMYOPATHY Hypertrophic Cardiomyopathy 174 12 0.935
Other 3 0 0.938
1049: DILATED MYOPATHY: OTHER SPECIFY Dilated Cardiomyopathy 280 17 0.942
1050: RESTRICTIVE MYOPATHY: IDIOPATHIC Restrictive Cardiomyopathy 233 14 0.942
1209: MUSCULAR DYSTROPHY: OTHER SPECIFY Dilated Cardiomyopathy 5 0 0.950
1000: DILATED MYOPATHY: IDIOPATHIC Dilated Cardiomyopathy 1178 58 0.953
1006: DILATED MYOPATHY: VIRAL Myocarditis 26 1 0.953
1200: CORONARY ARTERY DISEASE Other 10 0 0.967
1003: DILATED MYOPATHY: FAMILIAL Dilated Cardiomyopathy 191 6 0.968
1208: ARRHYTHMOGENIC RIGHT VENTRICULAR DYSPLASIA/CARDIOMYOPATHY Other 19 0 0.979
1007: DILATED MYOPATHY: ISCHEMIC Dilated Cardiomyopathy 31 0 0.986

Center Level Predictors

Listing Center Volume

  • Calculate the number of pediatric heart transplants each year by listing center
Code
#: Number of pediatric hr transplants in previous year
TX_YR = thoracic %>% 
  filter(
    coalesce(AGE, INIT_AGE) < 18,     # Age of transplant recipient
    ORGAN == "HR"
  ) %>% 
  count(LISTING_CTR_CODE, YR = TX_YEAR) %>% 
  complete(LISTING_CTR_CODE, YR, fill = list(n=0)) %>% 
  group_by(LISTING_CTR_CODE) %>% 
    arrange(YR) %>% 
    mutate(
      LISTING_CTR_PEDHRTX_PREV_YR = dplyr::lag(n, default=0)
    ) %>% 
  ungroup() %>% 
  select(-n)

Listing Center Practice

NOTE: These are averaged over 2010-2019. This can make things a bit misleading as centers that did good will continue to have success.

  • median_refusals: median number of offer refusals per candidate. Averaged over 2010-2019 by listing center

  • median_wait_days: median number of days candidate needs to wait until first offer. Averaged over 2010-2019 by listing center.

  • median_wait_days_1A: median number of days that a Status 1A candidate needs to wait until first offer. Averaged over 2010-2019 by listing center. This will help standardize over centers that have disproportionate number of Status 2’s (for example).

  • median_wait_days_status: median number of days candidate needs to wait until first offer (by listing status; 1A will wait less than 1B). Averaged over 2010-2019 by listing center

  • LIST_YR: year of waitlist

  • REGION: UNOS Region

Load data to calculate center stats

Code
refusals = read_csv(file.path(dir_data, "refusals.csv"))
WT = read_csv(file.path(dir_data, "waiting_time.csv"))
LC_effects = read_csv(file.path(dir_data, "LC_effects.csv"))

Make center level predictors

Code
center_data = thoracic %>% 
  semi_join(population, by="WL_ID_CODE") %>% 
  transmute(
    WL_ID_CODE, 
    LISTYR,       # year of listing
    #: Center
    LISTING_CTR_CODE,
    #: UNOS Region
    REGION, 
  ) %>% 
  # Listing center volume (in previous year)
  left_join(
    TX_YR,
    by = c("LISTYR" = "YR", "LISTING_CTR_CODE")
  ) %>% 
  # Refusal rates
  left_join(
    refusals %>% select(LISTING_CTR_CODE, median_refusals, 
                        mean_refusals, median_refusals_old, p_refusals),
    by = "LISTING_CTR_CODE"
    ) %>% 
  # add median waiting time 
  left_join(WT, by = "LISTING_CTR_CODE") %>% 
  # add listing center effects from cox-ph
  left_join(LC_effects, by = "LISTING_CTR_CODE") %>% 
  # add STATUS for next calculation
  left_join(
    candidate_risk_data %>% select(WL_ID_CODE, STATUS),
    by = "WL_ID_CODE"
  ) %>% 
  # format
  transmute(
    WL_ID_CODE, 
    LISTING_CTR_CODE = factor(LISTING_CTR_CODE),
    LIST_YR = LISTYR,
    REGION = factor(REGION, 1:11), 
    pedhrtx_prev_yr = coalesce(LISTING_CTR_PEDHRTX_PREV_YR, 0), 
    median_refusals,
    median_refusals_old,
    mean_refusals,
    p_refusals,
    LC_effect,
    median_wait_days,
    median_wait_days_1A,
    median_wait_days_STATUS = case_match(STATUS,
                                  "Status 1A" ~ median_wait_days_1A, 
                                  "Status 1B" ~ median_wait_days_1B, 
                                  "Status 2"  ~ median_wait_days_2, 
                                  "Inactive"  ~ median_wait_days_2,  
                                  )
  )

EDA: Listing Center Variables

Code
model_data = center_data %>% full_join(population, by = "WL_ID_CODE")
vars = colnames(model_data) %>% setdiff(colnames(population))
walk(vars, ~plot_data(model_data, var = .x) %>% print)

Table of Listing Center outcomes (using Bayesian estimation to help estimation of centers with few candidates):

Code
thoracic %>% 
  inner_join(population, by="WL_ID_CODE") %>% 
  count(LISTING_CTR_CODE, outcome) %>% 
  spread(outcome, n, fill=0) %>% 
  mutate(
    n = `0` + `1`,
    p = (`0` + .9*5) / (`0` + `1` + 5)
  ) %>% 
  arrange(p) %>% 
  left_join(
    center_data %>% 
      distinct(LISTING_CTR_CODE, median_refusals, median_wait_days_1A),
    by = "LISTING_CTR_CODE"
  ) %>% 
  print_table()
LISTING_CTR_CODE 0 1 n p median_refusals median_wait_days_1A
24521 0 2 2 0.643
04464 88 32 120 0.740 13 10.0
19716 0 1 1 0.750 4
23622 0 1 1 0.750 4 0.0
25644 10 3 13 0.806 8 12.0
04774 23 6 29 0.809 4 25.5
14942 23 6 29 0.809 15 4.0
02728 121 28 149 0.815 13 9.0
01271 12 3 15 0.825 8 83.0
03906 65 14 79 0.827 4 27.0
04557 32 7 39 0.830 5 16.0
11749 3 1 4 0.833 4
19499 19 4 23 0.839 16 9.0
00279 56 11 67 0.840 11 14.0
06572 32 6 38 0.849 8 12.0
06820 49 9 58 0.849 5 9.0
00248 185 33 218 0.850 15 13.5
12524 67 12 79 0.851 8 9.0
03813 76 13 89 0.856 6 10.0
04712 54 9 63 0.860 15 22.0
11532 42 7 49 0.861 13 8.0
15159 50 8 58 0.865 4 10.0
23312 148 23 171 0.866 10 7.0
06603 58 9 67 0.868 3 11.0
08494 39 6 45 0.870 16 14.0
25657 33 5 38 0.872 1 8.5
09362 72 10 82 0.879 2 13.0
22289 31 4 35 0.887 3 13.0
22878 39 5 44 0.888 6 10.5
19468 208 26 234 0.889 5 9.0
13485 166 20 186 0.893 4 8.0
20677 108 13 121 0.893 5 9.0
05704 35 4 39 0.898 5 22.0
25110 22 2 24 0.914 7 12.0
00868 131 12 143 0.916 8 7.0
01395 1 0 1 0.917 14 43.0
08587 1 0 1 0.917 6
11067 1 0 1 0.917 8 1.0
11191 1 0 1 0.917 2
12834 1 0 1 0.917
13051 1 0 1 0.917 2 4.0
13640 1 0 1 0.917 6 28.0
15283 1 0 1 0.917 6 29.0
18786 1 0 1 0.917 2
19437 232 21 253 0.917 4 13.0
20522 1 0 1 0.917 5
22692 1 0 1 0.917
23033 1 0 1 0.917 6
23839 1 0 1 0.917 2 8.0
24304 1 0 1 0.917
25420 1 0 1 0.917 6 1.0
25711 12 1 13 0.917 3 42.5
26537 1 0 1 0.917
15531 101 9 110 0.917 3 19.0
00465 193 17 210 0.919 4 10.0
08277 150 13 163 0.920 7 13.0
07347 71 6 77 0.921 4 12.5
00961 2 0 2 0.929 2 11.5
05642 41 3 44 0.929 2 18.5
09548 2 0 2 0.929 26
16430 2 0 2 0.929 0 1.0
18259 2 0 2 0.929 3 9.0
18755 2 0 2 0.929 8 1.0
19530 2 0 2 0.929 3 11.5
06417 120 9 129 0.929 4 7.0
25518 68 5 73 0.929 3 17.0
13268 113 8 121 0.933 5 9.0
05053 115 8 123 0.934 7 4.0
07471 59 4 63 0.934 4 6.5
02976 60 4 64 0.935 3 24.0
23901 119 8 127 0.936 3 19.0
00124 239 16 255 0.937 5 5.0
02418 3 0 3 0.938 4 0.5
04867 3 0 3 0.938 2 11.0
20863 111 7 118 0.939 4 9.0
19034 19 1 20 0.940 6 13.0
11935 4 0 4 0.944 1 14.5
14477 4 0 4 0.944 5 1.0
23591 4 0 4 0.944 2
02883 124 7 131 0.945 2 12.0
13795 202 11 213 0.947 4 9.0
05890 96 5 101 0.948 4 16.0
18414 174 9 183 0.949 2 21.0
05487 5 0 5 0.950 1 64.0
25841 25 1 26 0.952 29 17.5
15872 70 3 73 0.955 3 13.0
11129 7 0 7 0.958 1 2.0
00403 204 8 212 0.961 3 16.5
18972 8 0 8 0.962 1 6.5
10478 38 1 39 0.966 3 14.0
24800 10 0 10 0.967 4 22.5
03503 11 0 11 0.969 5 4.5
02666 43 0 43 0.990 4 12.0

Final Survival Analysis Dataset

Combine all predictor variables into model_data:

Code
model_data = list(
  population %>% select(outcome, WL_ID_CODE), 
  candidate_demographic_data,
  candidate_risk_data,
  candidate_diagnosis, 
  center_data,
  DATA_WL
) %>% 
  purrr::reduce(left_join, by = "WL_ID_CODE")

Save model_data.{csv, rds}

model_data %>%
  write_csv(file.path(dir_save, "model_data.csv"))

model_data %>%
  write_rds(file.path(dir_save, "model_data.rds"))